Abstract

We introduce a novel concept, the minimal molecular surface (MMS), as a new paradigm
for the theoretical modeling of biomolecule-solvent interfaces. When a less polar macromolecule
is immersed in a polar environment, surface free energy minimization occurs
naturally to stabilize the system and leads to an MMS separating the macromolecule
from the solvent. For a given set of atomic constraints (as obstacles), the MMS is defined
as one whose mean curvature vanishes away from the obstacles. An iterative procedure
is proposed to compute the MMS. Extensive examples are given to validate the proposed
algorithm and illustrate the new concept. We show that the MMS provides an indication
of DNA-binding specificity. The proposed algorithm represents a major step forward in
minimal surface generation.
The stability and solubility of macromolecules, such as proteins, DNAs and RNAs, are
determined by how their surfaces interact with solvent and/or other surrounding molecules.
Therefore, the structure and function of macromolecules depend on the features of their
molecule-solvent interfaces [1]. The molecular surface was proposed [2] to describe these
interfaces and has been applied to protein folding [3], protein-protein interfaces [4], protein surface
topography [5], oral drug absorption classification [6], DNA binding and bending [7],
macromolecular docking [8], enzyme catalysis [9], calculation of solvation energies [10], and molecular
dynamics [11]. It is of paramount importance to implicit solvent models. However,
the molecular surface model is probe dependent, non-differentiable, and
inconsistent with free energy minimization.

Minimal surfaces are omnipresent in nature. Their study has been a fascinating topic for
centuries [HI [02 EH. The French geometer Meusnier constructed the first non-trivial example,
the catenoid, a minimal surface that connects two parallel circles, in the 18th century. In 1760,
Lagrange discovered the relation between minimal surfaces and a variational principle, which is
still a cornerstone of modern mechanics. Plateau studied minimal surfaces in soap films in the
mid-nineteenth century. In the liquid phase, materials of largely different polarizabilities, such as
water and oil, do not mix, and the material in smaller quantity forms ellipsoidal drops, whose
surfaces are minimal subject to the gravitational constraint. The self-assembly of minimal
cell membrane surfaces in water has been discussed [E]. The Schwarz P minimal surface is
known to play a role in periodic crystal structures [18]. The formation of β-sheet structures in
proteins is regarded as the result of surface minimization on a catenoid [19]. A minimal surface
metric has been proposed for the structural comparison of proteins [20]. However, to the best
of our knowledge, a natural minimal surface that separates a less polar macromolecule from
its polar environment, such as the water solvent, has not been considered yet. The objective of
this Report is to introduce the theory of, and an algorithm to generate, minimal molecular surfaces
(MMSs). Since the surface free energy is proportional to the surface area, an MMS contributes to
the molecular stability in solvent. Therefore, there must be an MMS associated with each stable
macromolecule in its polar environment. Although minimal surfaces are often generated by
evolving surfaces with predetermined curve boundaries [21, 22], there is no algorithm available
that generates minimal surfaces with respect to obstacles, such as atoms. Here, we develop
such an algorithm based on the theory of differential geometry [23].

For a given initial function $S(x, y, z)$ that characterizes a domain encompassing the biomolecule
of interest, we consider an evolution driven by the mean curvature $H$

$$S_\varepsilon(x, y, z) = S(x, y, z) + \varepsilon \sqrt{g}\, H, \qquad (1)$$

where $\varepsilon > 0$ is a small parameter, $g = 1 + S_x^2 + S_y^2 + S_z^2$ is the Gram determinant, and
$H = \frac{1}{3} \nabla \cdot \left( \frac{\nabla S}{\sqrt{g}} \right)$. Our procedure involves iterating Eq. (1) until $H \approx 0$ everywhere except for certain
protected boundary points where the mean curvatures take constant values. Physically, the
vanishing of the mean curvature is a natural consequence of surface free energy minimization.
Consider the surface free energy of a molecule as $E = \int_{\partial M} \sigma(x, y, z)\, d\Omega$, where $\partial M$ is the boundary
of the molecule, $\sigma$ the energy density and $d\Omega = \sqrt{g}\, dx\, dy\, dz$. The energy minimization via the
first variation leads to the Euler-Lagrange equation,
Figure 1: MMS generation. (a) Illustration of $r$, $\tilde{r}$, and $L$ at a cross section $z = 0$; (b)
$S(x, y, z = 0)$ shows a family of level surfaces; (c) The isosurface extracted from $S = 0.9950$.

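$$\frac{\partial e}{\partial S} - \frac{\partial}{\partial x} \frac{\partial e}{\partial S_x} - \frac{\partial}{\partial y} \frac{\partial e}{\partial S_y} - \frac{\partial}{\partial z} \frac{\partial e}{\partial S_z} = 0, \qquad (2)$$

where $e = \sigma \sqrt{g}$. For a homogeneous surface, $\sigma = \sigma_0$, a constant, Eq. (2) leads to the vanishing
of the mean curvature: $\sigma_0 \nabla \cdot \left( \frac{\nabla S}{\sqrt{g}} \right) = 3 \sigma_0 H = 0$.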



For a given set of atomic coordinates, we prescribe a step function initial value for $S(x, y, z)$,
i.e., a non-zero constant $S_0$ inside a sphere of radius $\tilde{r}$ about each atom and zero elsewhere.
Alternatively, a Gaussian initial value can be placed around each atomic center. The value
of $S(x, y, z)$ is updated in the iteration except at obstacles, i.e., a set of boundary points
given by the collection of all of the van der Waals sphere surfaces or any other desired atomic
sphere surfaces. Here $H$ and $g$ can be approximated by any standard numerical method.
For simplicity, we use the standard second-order central finite difference. For stability,
we choose $\varepsilon$ below a bound set by $h$, the smallest grid spacing. The MMS is differentiable,
probe independent, and consistent with surface free energy minimization.
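
The iteration described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `evolve_mms` and `diatomic_setup` are hypothetical, the protected set is taken to be all grid points within radius $r$ of an atom (a simplification of the sphere-surface obstacles), the step size `eps = 0.2 * h**2` is an assumed value, and the nested `np.gradient` calls use central differences in the interior with one-sided differences at the grid edges.

```python
import numpy as np

def evolve_mms(S, protected, h, eps, n_iter):
    """Iterate S <- S + eps*sqrt(g)*H (Eq. 1), holding S fixed on `protected` points."""
    S = S.astype(float).copy()
    fixed = S[protected].copy()
    for _ in range(n_iter):
        Sx, Sy, Sz = np.gradient(S, h)                 # finite-difference gradient of S
        sqrt_g = np.sqrt(1.0 + Sx**2 + Sy**2 + Sz**2)  # square root of the Gram determinant
        div = (np.gradient(Sx / sqrt_g, h, axis=0)
               + np.gradient(Sy / sqrt_g, h, axis=1)
               + np.gradient(Sz / sqrt_g, h, axis=2))
        H = div / 3.0                                  # H = (1/3) div(grad S / sqrt(g))
        S += eps * sqrt_g * H
        S[protected] = fixed                           # obstacles are never updated
    return S

def diatomic_setup(r, r_tilde, L, h, S0=1.0):
    """Step-function initial value and protected mask for two atoms on the x axis."""
    half = L / 2 + r_tilde + 4 * h                     # padding of 4 cells is an assumption
    ax = np.arange(-half, half + h / 2, h)
    X, Y, Z = np.meshgrid(ax, ax, ax, indexing="ij")
    centers = [(-L / 2, 0.0, 0.0), (L / 2, 0.0, 0.0)]
    dist = np.min([np.sqrt((X - cx)**2 + (Y - cy)**2 + (Z - cz)**2)
                   for cx, cy, cz in centers], axis=0)
    S_init = np.where(dist <= r_tilde, S0, 0.0)        # S0 inside radius r_tilde, 0 elsewhere
    protected = dist <= r                              # assumed obstacle set
    return S_init, protected

h = 0.5
S_init, protected = diatomic_setup(r=1.0, r_tilde=1.5, L=2.0, h=h)
S = evolve_mms(S_init, protected, h=h, eps=0.2 * h**2, n_iter=20)
```

A production run would use a finer grid, iterate until the update falls below a tolerance, and restrict the protected set to the sphere surfaces as the text specifies.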

As a proof of principle, we illustrate our ideas with a few examples. We first test the proposed
method on the MMS of a diatomic molecule. The atomic radius is $r$ and the distance between the
atomic centers is $L$. First, we consider a step function initial value with $\tilde{r} > L/2 > r$; see Fig. 1(a) for
an illustration. Of course, the steady state solution of $S(x, y, z)$ does not directly provide a
surface. Instead, it gives rise to a family of level surfaces, which includes the desired MMS. It
turns out that $S(x, y, z)$ is very flat away from the MMS, while it varies sharply at the MMS.
In other words, $S(x, y, z)$ is virtually a step function at the desired MMS; see Fig. 1(b).
Therefore, it is easy to extract the MMS as an isosurface at $S(x, y, z) = C$, as shown in
Fig. 1(c). It is convenient to choose $C = (1 - \delta) S_0$, where $\delta > 0$ is a very small number that can
be calibrated by standard tests. Computationally, by taking $S_0 = 1000$, satisfactory results
can be attained by using $\delta$ values ranging from 0.004 to 0.01. We next test whether there is any
initial value constraint in our method. Indeed, it is found that if $r < \tilde{r} < L/2$, two isolated spheres
are obtained instead of the MMS. Therefore initial connectivity ($\tilde{r} > L/2$) is crucial for the
formation of MMSs.
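
To make the isosurface level concrete, the following sketch locates where a sampled one-dimensional profile of $S$ crosses $C = (1 - \delta) S_0$ by linear interpolation. The helper `level_crossings` and the five-point profile are hypothetical and only illustrate the thresholding step; a full implementation would apply a three-dimensional isosurface extractor to the steady-state $S$.

```python
import numpy as np

def level_crossings(x, s, C):
    """Positions where the sampled profile s(x) crosses level C (linear interpolation)."""
    x = np.asarray(x, dtype=float)
    d = np.asarray(s, dtype=float) - C
    idx = np.where(d[:-1] * d[1:] < 0)[0]    # sign change between samples idx and idx+1
    t = d[idx] / (d[idx] - d[idx + 1])       # fractional position of the crossing
    return x[idx] + t * (x[idx + 1] - x[idx])

S0, delta = 1000.0, 0.005                    # delta chosen inside the quoted range 0.004-0.01
C = (1.0 - delta) * S0                       # isosurface level, here approximately 995
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
s = [0.0, 990.0, 1000.0, 990.0, 0.0]         # made-up, nearly step-like profile
crossings = level_crossings(x, s, C)
```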

It is important to know whether the initially connected $S(x, y, z)$ could eventually separate
into two regions when $L$ is sufficiently large. The lower bound and the upper bound of the
MMS areas are $4\pi r^2$ and $8\pi r^2$, respectively, for the diatomic system. When $L$ is small, the
MMS consists of a catenoid and parts of two spheres and the MMS area is smaller than the
upper bound, see Fig. 2(a). When the separation length $L$ is gradually increased, the MMS
area grows, while the neck of the MMS becomes thinner and thinner, see Fig. 2(b).
Figure 2: The MMS of a diatomic molecule at different separation lengths $L$. (a) $L = 2.0r$;
(b) $L = 2.4r$; (c) $L = 2.45r$.




Figure 3: The MMS of benzene with van der Waals radii and scaled atomic radii. (a) Van der
Waals radii, $r_C = 1.7$ Å and $r_H = 1.2$ Å; (b) Atomic radii, $r_C = 0.7$ Å and $r_H = 0.38$ Å; (c)
Scaled atomic radii, $r_C = 0.63$ Å and $r_H = 0.34$ Å; (d) Scaled atomic radii, $r_C = 0.56$ Å and
$r_H = 0.30$ Å.

At a critical distance $L_c > 2r$, the MMS area reaches the upper bound $8\pi r^2$ and the MMS
breaks into two disjoint pieces. The present study predicts $L_c \approx 2.426r$. In fact, this result is
initial-value independent as long as $\tilde{r} > L_c/2 = 1.213r$. We found that a Gaussian initial value
gives the same prediction. As the molecular surface area is proportional to the surface Gibbs
free energy, the critical value ($L_c$) might provide an indication of molecular dissociation
and could be used in molecular modeling.

We next consider the MMS of the benzene molecule, which consists of six carbon atoms and
six hydrogen atoms. The carbon atoms are in sp$^2$ hybrid states with delocalized $\pi$ stabilization.
The MMSs of the benzene molecule with van der Waals radii ($r_{vdW}$) and other atomic radii are
depicted in Fig. 3. By using the van der Waals radii, a bulky MMS is obtained. A topologically
similar but smaller MMS is formed using the set of standard atomic radii. No ring structure
is seen until the atomic radii are reduced by a factor of 0.9, see Fig. 3(c). Clearly, all atoms
are connected via catenoids. Eventually, the MMS decomposes into 12 pieces when the radii are
further reduced to slightly below their critical values, see Fig. 3(d). This again confirms our
prediction of the critical separation $L_c \approx 2.426r$.

Finally, we employ our MMS to study the mechanism of molecular recognition in protein-DNA
interactions. NMR and molecular dynamics studies suggest that antennapedia achieves
specificity through an ensemble of rapidly fluctuating DNA contacts [21], while the X-ray structure
Figure 4: The MMS and the MS at the contacting regions of the antennapedia homeodomain-DNA
complex (PDB ID: 9ant). The MMSs of the DNA and the antennapedia homeodomain
are shown in (a) and (c), respectively, while the MS ones are shown in (b) and (d), respectively.

indicates a well-defined set of contacts due to side-chain constraints [25]. In the present work,
we reveal flat contacting interfaces that stabilize the protein-DNA complex. Figs. 4(a)
and 4(c) depict the MMSs of the DNA and antennapedia (PDB ID: 9ant), generated by using
$\tilde{r} = 1.3 r_{vdW}$ and $h = 0.2$ Å. Clearly, the binding site of the DNA (middle groove) has a large
facet, which is absent from the top and bottom grooves of the DNA. Interestingly, the MMS
of the protein exhibits a complementary facet. For comparison, the molecular surfaces (MSs)
generated by using the program MSMS [26] with the same set of van der Waals radii and a probe
radius of 1.5 Å are depicted in Figs. 4(b) and 4(d). It is very difficult to recognize
the complementary binding interfaces from the MSs. It is interesting to note that the MMS also
better reveals the skeleton of the DNA's double helix structure. To quantify the affinity at
the contacting site, we compute the mean distance between the MMSs of the protein and DNA
by using about 7200 surface vertices over the binding domain. A small mean distance of 0.4054
Å reveals a close contact between the two facets. The relatively small standard deviation of 0.3401
Å indicates the smoothness of the contacting facets. In contrast, an inconclusive mean (0.8697
Å) and standard deviation (0.5818 Å) were found for the corresponding MSs. This study
indicates the great potential of the proposed MMS for the prediction and recognition of
biomolecular binding sites.
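
The text does not specify how the mean distance between the two surfaces is defined. One plausible reading, sketched below as an assumption, takes for each vertex of one surface the distance to the nearest vertex of the other and reports the mean and standard deviation. The function name `nearest_distances` and the two flat test patches are hypothetical; the chunked loop only keeps memory bounded for vertex counts in the thousands.

```python
import numpy as np

def nearest_distances(a, b, chunk=512):
    """For each vertex in a, the Euclidean distance to its nearest vertex in b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    out = np.empty(len(a))
    for s in range(0, len(a), chunk):
        diff = a[s:s + chunk, None, :] - b[None, :, :]   # pairwise differences for one chunk
        out[s:s + chunk] = np.sqrt((diff**2).sum(axis=2)).min(axis=1)
    return out

# Toy check: two parallel flat patches separated by 0.4 along z.
gx, gy = np.meshgrid(np.arange(10.0), np.arange(10.0), indexing="ij")
patch_a = np.column_stack([gx.ravel(), gy.ravel(), np.zeros(gx.size)])
patch_b = patch_a + np.array([0.0, 0.0, 0.4])

d = nearest_distances(patch_a, patch_b)
mean_d, std_d = d.mean(), d.std()
```

For two perfectly flat, aligned facets the standard deviation vanishes, which is the sense in which a small standard deviation signals smooth complementary contact.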

We have introduced a novel concept, the minimal molecular surface (MMS), for the modeling
of biomolecules, based on the premise that free energy minimization stabilizes a less
polar molecule in a polar solvent. The MMS is probe independent, differentiable, and consistent
with surface free energy minimization. A novel hypersurface approach based on the theory
of differential geometry is developed to generate the MMSs of arbitrarily complex molecules.
Numerical experiments are carried out on few-atom and many-atom systems to demonstrate
the proposed method. It is believed that the proposed MMS provides a new paradigm for the
studies of surface biology, chemistry and physics, in particular, for the analysis of stability,
solubility, solvation energy, and interaction of macromolecules, such as proteins, membranes,
DNAs and RNAs. It has potential applications not only in science, but also in technology,
such as vehicle design and packaging problems.
A Supplement: Derivation of the mean curvature evolution
equation

Consider a $C^2$ immersion $f: U \to \mathbb{R}^4$, where $U \subset \mathbb{R}^3$ is an open set. Here $f(u) =
(f_1(u), f_2(u), f_3(u), f_4(u))$ is a hypersurface element (or a position vector), and $u = (u_1, u_2, u_3) \in U$.

Tangent vectors (or directional vectors) of $f$ are $X_i = \frac{\partial f}{\partial u_i}$. The Jacobi matrix of the
mapping $f$ is given by $Df = (X_1, X_2, X_3)$.

The first fundamental form is a symmetric, positive definite metric tensor of $f$, given by
$I := (g_{ij}) = (Df)^T \cdot (Df)$. Its matrix elements can also be expressed as $g_{ij} = \langle X_i, X_j \rangle$,
where $\langle \cdot, \cdot \rangle$ is the Euclidean inner product in $\mathbb{R}^4$, $i, j = 1, 2, 3$.

Let $\nu(u)$ be the unit normal vector given by the Gauss map $\nu: U \to S^3$,

$$\nu(u_1, u_2, u_3) := \frac{X_1 \times X_2 \times X_3}{\| X_1 \times X_2 \times X_3 \|} \in \perp_u f, \qquad (3)$$

where the cross product in $\mathbb{R}^4$ is a generalization of that in $\mathbb{R}^3$. Here $\perp_u f$ is the normal space
of $f$ at the point $p = f(u)$. The vector $\nu$ is perpendicular to the tangent hyperplane $T_u f$ at $p$.
Note that $T_u f \oplus \perp_u f = T_{f(u)} \mathbb{R}^4$, the tangent space at $p$. By means of the normal vector $\nu$ and
the tangent vectors $X_i$, the second fundamental form is given by

$$II(X_i, X_j) = (h_{ij}) = \left( \left\langle -\frac{\partial \nu}{\partial u_i}, X_j \right\rangle \right). \qquad (4)$$

The mean curvature can be calculated from

$$H = \frac{1}{3} h_{ij} g^{ji}, \qquad (5)$$

where we use the Einstein summation convention, and $(g^{ij}) = (g_{ij})^{-1}$.

Let $U \subset \mathbb{R}^3$ be an open set and suppose $\bar{U}$ is compact with boundary $\partial U$. Let $f_\varepsilon: \bar{U} \to \mathbb{R}^4$
be a family of hypersurfaces indexed by $\varepsilon > 0$, obtained by deforming $f$ in the normal direction
according to the mean curvature. Explicitly, we set

$$f_\varepsilon(x, y, z) := f(x, y, z) + \varepsilon H \nu(x, y, z). \qquad (6)$$

We wish to iterate this, leading to a minimal hypersurface, that is, $H = 0$ in all of $U$, except
possibly where barriers (atomic constraints) are encountered.

For our purpose, let us choose $f(u) = (x, y, z, S)$, where $S(x, y, z)$ is a function of interest.
We have the first fundamental form:

$$(g_{ij}) = \begin{pmatrix} 1 + S_x^2 & S_x S_y & S_x S_z \\ S_x S_y & 1 + S_y^2 & S_y S_z \\ S_x S_z & S_y S_z & 1 + S_z^2 \end{pmatrix}. \qquad (7)$$

The inverse matrix of $(g_{ij})$ is given by

$$(g^{ij}) = \frac{1}{g} \begin{pmatrix} 1 + S_y^2 + S_z^2 & -S_x S_y & -S_x S_z \\ -S_x S_y & 1 + S_x^2 + S_z^2 & -S_y S_z \\ -S_x S_z & -S_y S_z & 1 + S_x^2 + S_y^2 \end{pmatrix}, \qquad (8)$$

where $g = \mathrm{Det}(g_{ij}) = 1 + S_x^2 + S_y^2 + S_z^2$ is the Gram determinant. The normal vector can be
computed from Eq. (3):

$$\nu = (-S_x, -S_y, -S_z, 1)/\sqrt{g}. \qquad (9)$$
The second fundamental form is given by

$$(h_{ij}) = \left( \frac{1}{\sqrt{g}} S_{x_i x_j} \right), \qquad (10)$$

i.e., the Hessian matrix of $S$ scaled by $1/\sqrt{g}$.

We consider a family $f_\varepsilon = (x, y, z, S_\varepsilon)$, where

$$S_\varepsilon(x, y, z) = S(x, y, z) + \varepsilon H \frac{1}{\sqrt{g}}. \qquad (11)$$



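The explicit form for the mean curvature can be written as

$$H = \frac{1}{3} \nabla \cdot \left( \frac{\nabla S}{\sqrt{g}} \right). \qquad (12)$$

Thus, we arrive at the final evolution scheme

$$S_\varepsilon(x, y, z) = S(x, y, z) + \frac{\varepsilon}{3 \sqrt{g}} \nabla \cdot \left( \frac{\nabla S}{\sqrt{g}} \right). \qquad (13)$$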
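
The contraction behind the explicit form of $H$ is not written out in the text. A short sketch of the intermediate step, using the inverse metric of Eq. (8) in the index form $g^{ij} = (g \delta_{ij} - S_{x_i} S_{x_j})/g$ and the second fundamental form of Eq. (10), reads:

```latex
\begin{aligned}
H &= \tfrac{1}{3}\, h_{ij}\, g^{ji}
   = \frac{1}{3\sqrt{g}}\; S_{x_i x_j}\;
     \frac{g\,\delta_{ij} - S_{x_i} S_{x_j}}{g} \\
  &= \frac{1}{3}\left[ \frac{\Delta S}{\sqrt{g}}
     - \frac{S_{x_i} S_{x_j} S_{x_i x_j}}{g^{3/2}} \right]
   = \frac{1}{3}\, \nabla \cdot \left( \frac{\nabla S}{\sqrt{g}} \right),
\end{aligned}
```

where the last equality uses $\partial_{x_i} g^{-1/2} = -S_{x_j} S_{x_i x_j} / g^{3/2}$, which follows from $g = 1 + S_{x_j} S_{x_j}$.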

To balance the growth rate of the mean curvature operator, we replace $H$ by $gH$, which is
permissible since $g$ is nonsingular. This leads to Eq. (1) of the main text.
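
As a sanity check on the derivation, the trace form of Eq. (5) can be compared numerically with the divergence form of the mean curvature at a sample point. The sketch below is illustrative only: the test function $S = \sin x \cos y + z^2/2$, the evaluation point, and the helper names are arbitrary choices, and the divergence is approximated by central differences with an assumed step of $10^{-4}$.

```python
import numpy as np

def grad_S(p):
    """Analytic gradient of the test function S = sin(x) cos(y) + z**2 / 2."""
    x, y, z = p
    return np.array([np.cos(x) * np.cos(y), -np.sin(x) * np.sin(y), z])

def hess_S(p):
    """Analytic Hessian of the same test function."""
    x, y, z = p
    sxx = -np.sin(x) * np.cos(y)
    sxy = -np.cos(x) * np.sin(y)
    return np.array([[sxx, sxy, 0.0], [sxy, sxx, 0.0], [0.0, 0.0, 1.0]])

def H_trace(p):
    """H = (1/3) h_ij g^ji, with g_ij from Eq. (7) and h_ij from Eq. (10)."""
    v = grad_S(p)
    g = 1.0 + v @ v
    g_inv = np.linalg.inv(np.eye(3) + np.outer(v, v))
    h = hess_S(p) / np.sqrt(g)
    return np.trace(h @ g_inv) / 3.0

def flux(p):
    """The vector field grad S / sqrt(g)."""
    v = grad_S(p)
    return v / np.sqrt(1.0 + v @ v)

def H_div(p, d=1e-4):
    """H = (1/3) div(grad S / sqrt(g)), with the divergence by central differences."""
    p = np.asarray(p, dtype=float)
    div = 0.0
    for i in range(3):
        e = np.zeros(3)
        e[i] = d
        div += (flux(p + e)[i] - flux(p - e)[i]) / (2 * d)
    return div / 3.0

p = np.array([0.3, -0.7, 0.5])
v = grad_S(p)
g = 1.0 + v @ v
g_inv_closed = (g * np.eye(3) - np.outer(v, v)) / g   # closed form of Eq. (8)
g_inv_numeric = np.linalg.inv(np.eye(3) + np.outer(v, v))
h_trace, h_div = H_trace(p), H_div(p)
```

The two values of $H$ agree to within the finite-difference error, and the closed-form inverse matches the numerical inverse of the first fundamental form.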